In [1]:

Step 1: Preparing and Analyzing the Dataset.

In [2]:
In [3]:
Out[3]:
VIN (1-10) County City State Postal Code Model Year Make Model Electric Vehicle Type Clean Alternative Fuel Vehicle (CAFV) Eligibility Electric Range Base MSRP Legislative District DOL Vehicle ID Vehicle Location Electric Utility 2020 Census Tract
0 3C3CFFGE4E Yakima Yakima WA 98902.0 2014 FIAT 500 Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 87 0 14.0 1593721 POINT (-120.524012 46.5973939) PACIFICORP 5.307700e+10
1 5YJXCBE40H Thurston Olympia WA 98513.0 2017 TESLA MODEL X Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 200 0 2.0 257167501 POINT (-122.817545 46.98876) PUGET SOUND ENERGY INC 5.306701e+10
2 3MW39FS03P King Renton WA 98058.0 2023 BMW 330E Plug-in Hybrid Electric Vehicle (PHEV) Not eligible due to low battery range 20 0 11.0 224071816 POINT (-122.1298876 47.4451257) PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA) 5.303303e+10
3 7PDSGABA8P Snohomish Bothell WA 98012.0 2023 RIVIAN R1S Battery Electric Vehicle (BEV) Eligibility unknown as battery range has not b... 0 0 21.0 260084653 POINT (-122.1873 47.820245) PUGET SOUND ENERGY INC 5.306105e+10
4 5YJ3E1EB8L King Kent WA 98031.0 2020 TESLA MODEL 3 Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 322 0 33.0 253771913 POINT (-122.2012521 47.3931814) PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA) 5.303303e+10
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
166795 3FA6P0SU4D Spokane Spokane WA 99223.0 2013 FORD FUSION Plug-in Hybrid Electric Vehicle (PHEV) Not eligible due to low battery range 19 0 6.0 239527123 POINT (-117.369705 47.62637) BONNEVILLE POWER ADMINISTRATION||AVISTA CORP||... 5.306300e+10
166796 5YJYGDEE5M King Sammamish WA 98074.0 2021 TESLA MODEL Y Battery Electric Vehicle (BEV) Eligibility unknown as battery range has not b... 0 0 45.0 148715479 POINT (-122.0313266 47.6285782) PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA) 5.303303e+10
166797 7SAYGDEE5N Snohomish Mukilteo WA 98275.0 2022 TESLA MODEL Y Battery Electric Vehicle (BEV) Eligibility unknown as battery range has not b... 0 0 21.0 220504406 POINT (-122.299965 47.94171) PUGET SOUND ENERGY INC 5.306104e+10
166798 1G1RH6E43D Lewis Mossyrock WA 98564.0 2013 CHEVROLET VOLT Plug-in Hybrid Electric Vehicle (PHEV) Clean Alternative Fuel Vehicle Eligible 38 0 20.0 156418475 POINT (-122.487535 46.5290135) BONNEVILLE POWER ADMINISTRATION||CITY OF TACOM... 5.304197e+10
166799 5YJSA1E27H Pierce Gig Harbor WA 98332.0 2017 TESLA MODEL S Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 210 0 26.0 169045789 POINT (-122.589645 47.342345) BONNEVILLE POWER ADMINISTRATION||CITY OF TACOM... 5.305307e+10

166800 rows × 17 columns

In [4]:
In [5]:
Out[5]:
County Postal Code Model Year Make Model Electric Vehicle Type Clean Alternative Fuel Vehicle (CAFV) Eligibility Electric Range Base MSRP Legislative District Electric Utility
0 Yakima 98902.0 2014 FIAT 500 Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 87 0 14.0 PACIFICORP
1 Thurston 98513.0 2017 TESLA MODEL X Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 200 0 2.0 PUGET SOUND ENERGY INC
2 King 98058.0 2023 BMW 330E Plug-in Hybrid Electric Vehicle (PHEV) Not eligible due to low battery range 20 0 11.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA)
3 Snohomish 98012.0 2023 RIVIAN R1S Battery Electric Vehicle (BEV) Eligibility unknown as battery range has not b... 0 0 21.0 PUGET SOUND ENERGY INC
4 King 98031.0 2020 TESLA MODEL 3 Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 322 0 33.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA)
... ... ... ... ... ... ... ... ... ... ... ...
166795 Spokane 99223.0 2013 FORD FUSION Plug-in Hybrid Electric Vehicle (PHEV) Not eligible due to low battery range 19 0 6.0 BONNEVILLE POWER ADMINISTRATION||AVISTA CORP||...
166796 King 98074.0 2021 TESLA MODEL Y Battery Electric Vehicle (BEV) Eligibility unknown as battery range has not b... 0 0 45.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA)
166797 Snohomish 98275.0 2022 TESLA MODEL Y Battery Electric Vehicle (BEV) Eligibility unknown as battery range has not b... 0 0 21.0 PUGET SOUND ENERGY INC
166798 Lewis 98564.0 2013 CHEVROLET VOLT Plug-in Hybrid Electric Vehicle (PHEV) Clean Alternative Fuel Vehicle Eligible 38 0 20.0 BONNEVILLE POWER ADMINISTRATION||CITY OF TACOM...
166799 Pierce 98332.0 2017 TESLA MODEL S Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 210 0 26.0 BONNEVILLE POWER ADMINISTRATION||CITY OF TACOM...

166800 rows × 11 columns
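The code that reduced the frame from 17 to 11 columns is not shown. Comparing the two listings, the columns that disappear are VIN (1-10), City, State, DOL Vehicle ID, Vehicle Location and 2020 Census Tract. A minimal sketch of one way to do this, using a placeholder one-row frame (`df_raw` and `cols_to_drop` are illustrative names, not taken from the notebook):

```python
import pandas as pd

# Placeholder frame with the 17 column names from the first listing.
# The single row of None values is a stand-in, not real data.
columns = [
    "VIN (1-10)", "County", "City", "State", "Postal Code", "Model Year",
    "Make", "Model", "Electric Vehicle Type",
    "Clean Alternative Fuel Vehicle (CAFV) Eligibility", "Electric Range",
    "Base MSRP", "Legislative District", "DOL Vehicle ID",
    "Vehicle Location", "Electric Utility", "2020 Census Tract",
]
df_raw = pd.DataFrame([[None] * len(columns)], columns=columns)

# Columns present in the 17-column listing but absent from the 11-column one.
cols_to_drop = [
    "VIN (1-10)", "City", "State",
    "DOL Vehicle ID", "Vehicle Location", "2020 Census Tract",
]
df = df_raw.drop(columns=cols_to_drop)
print(df.shape)
```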

In [6]:
Out[6]:
11
In [7]:
Out[7]:
(166800, 11)
In [8]:
Out[8]:
County Postal Code Model Year Make Model Electric Vehicle Type Clean Alternative Fuel Vehicle (CAFV) Eligibility Electric Range Base MSRP Legislative District Electric Utility
0 Yakima 98902.0 2014 FIAT 500 Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 87 0 14.0 PACIFICORP
1 Thurston 98513.0 2017 TESLA MODEL X Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 200 0 2.0 PUGET SOUND ENERGY INC
2 King 98058.0 2023 BMW 330E Plug-in Hybrid Electric Vehicle (PHEV) Not eligible due to low battery range 20 0 11.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA)
3 Snohomish 98012.0 2023 RIVIAN R1S Battery Electric Vehicle (BEV) Eligibility unknown as battery range has not b... 0 0 21.0 PUGET SOUND ENERGY INC
4 King 98031.0 2020 TESLA MODEL 3 Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 322 0 33.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA)
In [9]:
There are 187 unique values in the County column out of 166800 rows
There are 836 unique values in the Postal Code column out of 166800 rows
There are 22 unique values in the Model Year column out of 166800 rows
There are 39 unique values in the Make column out of 166800 rows
There are 138 unique values in the Model column out of 166800 rows
There are 2 unique values in the Electric Vehicle Type column out of 166800 rows
There are 3 unique values in the Clean Alternative Fuel Vehicle (CAFV) Eligibility column out of 166800 rows
There are 102 unique values in the Electric Range column out of 166800 rows
There are 31 unique values in the Base MSRP column out of 166800 rows
There are 49 unique values in the Legislative District column out of 166800 rows
There are 76 unique values in the Electric Utility column out of 166800 rows
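A per-column report like the one above can be produced with `Series.nunique()`. This is a sketch on a small made-up frame; note that `nunique()` ignores missing values by default, which may or may not match what the original cell did.

```python
import pandas as pd

# Small illustrative frame; the values are made up.
df = pd.DataFrame({
    "Make": ["TESLA", "BMW", "TESLA", "FIAT"],
    "Model Year": [2014, 2017, 2014, 2023],
})

unique_counts = {}
for col in df.columns:
    # nunique() counts distinct non-null values in the column.
    unique_counts[col] = df[col].nunique()
    print(f"{col}: {unique_counts[col]} unique values out of {len(df)} rows")
```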
In [10]:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 166800 entries, 0 to 166799
Data columns (total 11 columns):
 #   Column                                             Non-Null Count   Dtype  
---  ------                                             --------------   -----  
 0   County                                             166795 non-null  object 
 1   Postal Code                                        166795 non-null  float64
 2   Model Year                                         166800 non-null  int64  
 3   Make                                               166800 non-null  object 
 4   Model                                              166800 non-null  object 
 5   Electric Vehicle Type                              166800 non-null  object 
 6   Clean Alternative Fuel Vehicle (CAFV) Eligibility  166800 non-null  object 
 7   Electric Range                                     166800 non-null  int64  
 8   Base MSRP                                          166800 non-null  int64  
 9   Legislative District                               166440 non-null  float64
 10  Electric Utility                                   166795 non-null  object 
dtypes: float64(2), int64(3), object(6)
memory usage: 14.0+ MB
In [11]:
Out[11]:
         Postal Code     Model Year  Electric Range      Base MSRP  Legislative District
count  166795.000000  166800.000000   166800.000000  166800.000000         166440.000000
mean    98173.713750    2020.341793       61.508993    1152.723171             29.178941
std      2442.584415       3.001465       93.271747    8661.081091             14.853534
min      1730.000000    1997.000000        0.000000       0.000000              1.000000
25%     98052.000000    2018.000000        0.000000       0.000000             18.000000
50%     98122.000000    2021.000000        0.000000       0.000000             33.000000
75%     98371.000000    2023.000000       84.000000       0.000000             42.000000
max     99577.000000    2024.000000      337.000000  845000.000000             49.000000

Step 2: Handling the Missing Values.

In [12]:
Out[12]:
County                                                 5
Postal Code                                            5
Model Year                                             0
Make                                                   0
Model                                                  0
Electric Vehicle Type                                  0
Clean Alternative Fuel Vehicle (CAFV) Eligibility      0
Electric Range                                         0
Base MSRP                                              0
Legislative District                                 360
Electric Utility                                       5
dtype: int64
In [13]:
Out[13]:
Legislative District                                 0.002158
County                                               0.000030
Postal Code                                          0.000030
Electric Utility                                     0.000030
Model Year                                           0.000000
Make                                                 0.000000
Model                                                0.000000
Electric Vehicle Type                                0.000000
Clean Alternative Fuel Vehicle (CAFV) Eligibility    0.000000
Electric Range                                       0.000000
Base MSRP                                            0.000000
dtype: float64
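The two outputs above are consistent with counting nulls per column and then taking the fraction of null rows, sorted in descending order (for example, 360 / 166800 ≈ 0.002158 for Legislative District). A minimal sketch on a made-up frame, assuming plain pandas calls:

```python
import numpy as np
import pandas as pd

# Illustrative frame with a few missing entries; values are made up.
df = pd.DataFrame({
    "County": ["King", None, "Pierce", "Yakima"],
    "Legislative District": [33.0, np.nan, np.nan, 14.0],
    "Make": ["TESLA", "BMW", "FIAT", "KIA"],
})

# Number of missing values per column.
missing_count = df.isnull().sum()

# Fraction of missing values per column, largest first.
# The mean of a boolean column is the share of True values.
missing_fraction = df.isnull().mean().sort_values(ascending=False)

print(missing_count)
print(missing_fraction)
```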
In [15]:
Out[15]:
<Axes: >
In [16]:
Out[16]:
County                                               0.002998
Postal Code                                          0.002998
Model Year                                           0.000000
Make                                                 0.000000
Model                                                0.000000
Electric Vehicle Type                                0.000000
Clean Alternative Fuel Vehicle (CAFV) Eligibility    0.000000
Electric Range                                       0.000000
Base MSRP                                            0.000000
Legislative District                                 0.215827
Electric Utility                                     0.002998
dtype: float64
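These values are percentages rather than fractions. As a quick arithmetic check against the missing-value counts reported earlier (360 for Legislative District, 5 for County, out of 166800 rows):

```python
# Counts taken from the missing-value listing in this document.
rows_total = 166800
missing_legislative_district = 360
missing_county = 5

# Percentage of missing rows = count / total * 100.
pct_legislative_district = missing_legislative_district / rows_total * 100
pct_county = missing_county / rows_total * 100

print(round(pct_legislative_district, 6))
print(round(pct_county, 6))
```

Rounded to six decimals these agree with the 0.215827 and 0.002998 shown in the listing.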
In [17]:
Out[17]:
<Axes: >
In [18]:
In [19]:
In [20]:
In [21]:
In [22]:
In [23]:
Out[23]:
County                                               0
Postal Code                                          0
Model Year                                           0
Make                                                 0
Model                                                0
Electric Vehicle Type                                0
Clean Alternative Fuel Vehicle (CAFV) Eligibility    0
Electric Range                                       0
Base MSRP                                            0
Legislative District                                 0
Electric Utility                                     0
dtype: int64
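The imputation cells themselves are not shown. One clue is that the later list of unique Legislative District values contains 29.17894136, which matches the column mean in the summary statistics, so mean imputation for that column seems likely. Filling the categorical columns with their mode is purely an assumption of this sketch:

```python
import numpy as np
import pandas as pd

# Illustrative frame; values are made up.
df = pd.DataFrame({
    "County": ["King", "King", None, "Pierce"],
    "Legislative District": [33.0, np.nan, 14.0, 43.0],
})

# Numeric column: replace missing values with the column mean.
district_mean = df["Legislative District"].mean()
df["Legislative District"] = df["Legislative District"].fillna(district_mean)

# Categorical column: replace missing values with the most frequent value.
# (Assumption: the original notebook may have used a different strategy.)
county_mode = df["County"].mode()[0]
df["County"] = df["County"].fillna(county_mode)

print(df.isnull().sum())
```

Mean imputation leaves a non-integer value in a column that is really a district identifier, so a mode fill or dropping the rows could be worth considering for such columns.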
In [24]:
Out[24]:
array([208,  14,  17,  93,  19,  32,  16, 111,  15, 220,  12, 245, 265,
        33, 100,  31], dtype=int64)
In [25]:
Out[25]:
array(['MODEL S', '330E', 'XC60', 'SOUL', 'CROSSTREK', 'XC90', 'PANAMERA',
       'PACIFICA', '530E', 'SOUL EV', 'ROADSTER', 'COUNTRYMAN', '740E',
       'CAYENNE', 'KARMA', 'WHEEGO', 'CT6', '918'], dtype=object)
In [26]:
Out[26]:
array([13.        , 47.        , 30.        , 41.        , 48.        ,
       23.        , 20.        ,  1.        , 45.        , 44.        ,
       18.        , 46.        , 35.        , 37.        ,  2.        ,
       26.        , 15.        , 36.        , 32.        , 11.        ,
       22.        , 34.        , 49.        ,  5.        , 43.        ,
       17.        , 24.        , 19.        , 33.        , 21.        ,
       10.        , 31.        , 14.        ,  7.        , 39.        ,
       29.        ,  6.        , 25.        , 40.        , 27.        ,
       38.        , 16.        , 28.        , 42.        , 12.        ,
        3.        ,  9.        ,  4.        ,  8.        , 29.17894136])
In [27]:
Out[27]:
array(['TESLA', 'BMW', 'VOLVO', 'KIA', 'SUBARU', 'PORSCHE', 'CHRYSLER',
       'MINI', 'FISKER', 'WHEEGO ELECTRIC CARS', 'CADILLAC'], dtype=object)
In [28]:
In [29]:
Cleaned Dataset:
Out[29]:
County Postal Code Model Year Make Model Electric Vehicle Type Clean Alternative Fuel Vehicle (CAFV) Eligibility Electric Range Base MSRP Legislative District Electric Utility
67 Kittitas 98940.0 2014 TESLA MODEL S Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 208 69900 13.0 PUGET SOUND ENERGY INC
152 King 98092.0 2017 BMW 330E Plug-in Hybrid Electric Vehicle (PHEV) Not eligible due to low battery range 14 44100 47.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA)
204 King 98023.0 2014 TESLA MODEL S Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 208 69900 30.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA)
317 King 98059.0 2013 TESLA MODEL S Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 208 69900 41.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA)
352 King 98004.0 2014 TESLA MODEL S Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 208 69900 41.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA)
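The cleaned listing keeps the original row labels (67, 152, 204, ...), and every visible row has a non-zero Base MSRP and Electric Range, which suggests a row filter was applied. The exact condition is not shown, so the following is only a sketch under the assumption that rows with a zero price or zero range were excluded (`df_clean` is an illustrative name):

```python
import pandas as pd

# Illustrative frame; values are made up.
df = pd.DataFrame({
    "Make": ["FIAT", "TESLA", "BMW", "RIVIAN"],
    "Base MSRP": [0, 69900, 44100, 0],
    "Electric Range": [87, 208, 14, 0],
})

# Keep rows where both fields carry real information (assumed condition).
mask = (df["Base MSRP"] > 0) & (df["Electric Range"] > 0)
df_clean = df[mask]

# Boolean filtering preserves the original index labels.
print(df_clean)
```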

Label Encoding for Ordinal Categorical Variables

In [30]:
Out[30]:
County Postal Code Model Year Make Model Electric Vehicle Type Clean Alternative Fuel Vehicle (CAFV) Eligibility Electric Range Base MSRP Legislative District Electric Utility
67 Kittitas 98940.0 2014 TESLA MODEL S Battery Electric Vehicle (BEV) 0 208 69900 13.0 PUGET SOUND ENERGY INC
152 King 98092.0 2017 BMW 330E Plug-in Hybrid Electric Vehicle (PHEV) 1 14 44100 47.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA)
204 King 98023.0 2014 TESLA MODEL S Battery Electric Vehicle (BEV) 0 208 69900 30.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA)
317 King 98059.0 2013 TESLA MODEL S Battery Electric Vehicle (BEV) 0 208 69900 41.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA)
352 King 98004.0 2014 TESLA MODEL S Battery Electric Vehicle (BEV) 0 208 69900 41.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA)
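In the listing above, "Clean Alternative Fuel Vehicle Eligible" appears as 0 and "Not eligible due to low battery range" as 1. A sketch that reproduces those codes with an explicit mapping follows; the original cell may have used a label encoder instead, and `eligibility_codes` is a name introduced here for illustration.

```python
import pandas as pd

col = "Clean Alternative Fuel Vehicle (CAFV) Eligibility"

# Illustrative frame using two of the category strings from the dataset.
df = pd.DataFrame({col: [
    "Clean Alternative Fuel Vehicle Eligible",
    "Not eligible due to low battery range",
    "Clean Alternative Fuel Vehicle Eligible",
]})

# Explicit mapping, chosen to match the 0/1 codes visible in the listing.
eligibility_codes = {
    "Clean Alternative Fuel Vehicle Eligible": 0,
    "Not eligible due to low battery range": 1,
}
df[col] = df[col].map(eligibility_codes)
print(df)
```

An explicit dictionary makes the ordering visible in the code; any category missing from the mapping becomes NaN, which is easy to detect.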

Feature Engineering:

In [31]:
Out[31]:
County Postal Code Model Year Make Model Electric Vehicle Type Clean Alternative Fuel Vehicle (CAFV) Eligibility Electric Range Base MSRP Legislative District Electric Utility Vehicle Age
67 Kittitas 98940.0 2014 TESLA MODEL S Battery Electric Vehicle (BEV) 0 208 69900 13.0 PUGET SOUND ENERGY INC 10
152 King 98092.0 2017 BMW 330E Plug-in Hybrid Electric Vehicle (PHEV) 1 14 44100 47.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA) 7
204 King 98023.0 2014 TESLA MODEL S Battery Electric Vehicle (BEV) 0 208 69900 30.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA) 10
317 King 98059.0 2013 TESLA MODEL S Battery Electric Vehicle (BEV) 0 208 69900 41.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA) 11
352 King 98004.0 2014 TESLA MODEL S Battery Electric Vehicle (BEV) 0 208 69900 41.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA) 10
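The Vehicle Age values are consistent with subtracting Model Year from 2024 (2014 gives 10, 2017 gives 7, 2013 gives 11). A minimal sketch, where `REFERENCE_YEAR` is an assumed constant inferred from those rows:

```python
import pandas as pd

# Assumed reference year, inferred from the listing (2014 -> age 10).
REFERENCE_YEAR = 2024

df = pd.DataFrame({"Model Year": [2014, 2017, 2013]})

# Age in years relative to the reference year.
df["Vehicle Age"] = REFERENCE_YEAR - df["Model Year"]
print(df)
```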
In [32]:
Out[32]:
County Postal Code Model Year Make Model Electric Vehicle Type Clean Alternative Fuel Vehicle (CAFV) Eligibility Electric Range Base MSRP Legislative District Electric Utility Vehicle Age Is Luxury Brand
67 Kittitas 98940.0 2014 TESLA MODEL S Battery Electric Vehicle (BEV) 0 208 69900 13.0 PUGET SOUND ENERGY INC 10 1
152 King 98092.0 2017 BMW 330E Plug-in Hybrid Electric Vehicle (PHEV) 1 14 44100 47.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA) 7 1
204 King 98023.0 2014 TESLA MODEL S Battery Electric Vehicle (BEV) 0 208 69900 30.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA) 10 1
317 King 98059.0 2013 TESLA MODEL S Battery Electric Vehicle (BEV) 0 208 69900 41.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA) 11 1
352 King 98004.0 2014 TESLA MODEL S Battery Electric Vehicle (BEV) 0 208 69900 41.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA) 10 1
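The rule behind Is Luxury Brand is not shown. In the listings in this document TESLA and BMW carry 1, while SUBARU, MINI and PORSCHE carry 0. A sketch of a membership-based flag follows; `luxury_makes` is a hypothetical list chosen only to reproduce those visible values.

```python
import pandas as pd

# Hypothetical list of makes treated as luxury; not taken from the notebook.
luxury_makes = ["TESLA", "BMW"]

df = pd.DataFrame({"Make": ["TESLA", "BMW", "SUBARU", "PORSCHE"]})

# 1 if the make is in the list, 0 otherwise.
df["Is Luxury Brand"] = df["Make"].isin(luxury_makes).astype(int)
print(df)
```

Because the flag depends entirely on the list, makes such as PORSCHE are worth reviewing before the feature is used in a model.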
In [33]:
Out[33]:
County Postal Code Model Year Make Model Electric Vehicle Type Clean Alternative Fuel Vehicle (CAFV) Eligibility Electric Range Base MSRP Legislative District Electric Utility Vehicle Age Is Luxury Brand
67 Kittitas 98940.0 2014 TESLA MODEL S Battery Electric Vehicle (BEV) 0 208 69900 13.0 PUGET SOUND ENERGY INC 10 1
152 King 98092.0 2017 BMW 330E Plug-in Hybrid Electric Vehicle (PHEV) 1 14 44100 47.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA) 7 1
204 King 98023.0 2014 TESLA MODEL S Battery Electric Vehicle (BEV) 0 208 69900 30.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA) 10 1
317 King 98059.0 2013 TESLA MODEL S Battery Electric Vehicle (BEV) 0 208 69900 41.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA) 11 1
352 King 98004.0 2014 TESLA MODEL S Battery Electric Vehicle (BEV) 0 208 69900 41.0 PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA) 10 1
... ... ... ... ... ... ... ... ... ... ... ... ... ...
166374 Walla Walla 99362.0 2013 TESLA MODEL S Battery Electric Vehicle (BEV) 0 208 69900 16.0 PACIFICORP 11 1
166399 Whatcom 98225.0 2019 SUBARU CROSSTREK Plug-in Hybrid Electric Vehicle (PHEV) 1 17 34995 42.0 PUGET SOUND ENERGY INC||PUD NO 1 OF WHATCOM CO... 5 0
166567 Clark 98661.0 2014 TESLA MODEL S Battery Electric Vehicle (BEV) 0 208 69900 49.0 BONNEVILLE POWER ADMINISTRATION||PUD NO 1 OF C... 10 1
166604 Grant 98837.0 2013 TESLA MODEL S Battery Electric Vehicle (BEV) 0 208 69900 13.0 PUD NO 2 OF GRANT COUNTY 11 1
166718 San Juan 98245.0 2020 PORSCHE CAYENNE Plug-in Hybrid Electric Vehicle (PHEV) 1 14 81100 40.0 BONNEVILLE POWER ADMINISTRATION||ORCAS POWER &... 4 0

3363 rows × 13 columns

Outlier Detection and Handling.

In [34]:
In [35]:
Current Directory: C:\Users\sharm
In [36]:
In [37]:
In [38]:
C:\Users\sharm\anaconda3\Lib\site-packages\dask\utils.py:1327: FutureWarning: Automatic reindexing on DataFrame vs Series comparisons is deprecated and will raise ValueError in a future version. Do `left, right = left.align(right, axis=1, copy=False)` before e.g. `left == right`
  return function(*args2, **kwargs)
Potential Outliers based on Z-scores:
        County Postal Code  Model Year     Make       Model  \
9         King       98011        2019   SUBARU   CROSSTREK   
12       Clark       98606        2018  PORSCHE    PANAMERA   
46        King       98033        2008    TESLA    ROADSTER   
60      Kitsap       98110        2018     MINI  COUNTRYMAN   
91        King       98177        2018     MINI  COUNTRYMAN   
...        ...         ...         ...      ...         ...   
3286    Pierce       98405        2019     MINI  COUNTRYMAN   
3301    Benton       99352        2019     MINI  COUNTRYMAN   
3314    Pierce       98332        2019   SUBARU   CROSSTREK   
3359   Whatcom       98225        2019   SUBARU   CROSSTREK   
3362  San Juan       98245        2020  PORSCHE     CAYENNE   

                       Electric Vehicle Type  \
9     Plug-in Hybrid Electric Vehicle (PHEV)   
12    Plug-in Hybrid Electric Vehicle (PHEV)   
46            Battery Electric Vehicle (BEV)   
60    Plug-in Hybrid Electric Vehicle (PHEV)   
91    Plug-in Hybrid Electric Vehicle (PHEV)   
...                                      ...   
3286  Plug-in Hybrid Electric Vehicle (PHEV)   
3301  Plug-in Hybrid Electric Vehicle (PHEV)   
3314  Plug-in Hybrid Electric Vehicle (PHEV)   
3359  Plug-in Hybrid Electric Vehicle (PHEV)   
3362  Plug-in Hybrid Electric Vehicle (PHEV)   

      Clean Alternative Fuel Vehicle (CAFV) Eligibility  Electric Range  \
9                                                     1              17   
12                                                    1              14   
46                                                    0             220   
60                                                    1              12   
91                                                    1              12   
...                                                 ...             ...   
3286                                                  1              12   
3301                                                  1              12   
3314                                                  1              17   
3359                                                  1              17   
3362                                                  1              14   

      Base MSRP  Legislative District  \
9         34995                   1.0   
12       184400                  18.0   
46        98950                  45.0   
60        36800                  23.0   
91        36800                  32.0   
...         ...                   ...   
3286      36900                  27.0   
3301      36900                   8.0   
3314      34995                  26.0   
3359      34995                  42.0   
3362      81100                  40.0   

                                       Electric Utility  Vehicle Age  \
9         PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA)            5   
12    BONNEVILLE POWER ADMINISTRATION||PUD NO 1 OF C...            6   
46        PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA)           16   
60                               PUGET SOUND ENERGY INC            6   
91         CITY OF SEATTLE - (WA)|CITY OF TACOMA - (WA)            6   
...                                                 ...          ...   
3286  BONNEVILLE POWER ADMINISTRATION||CITY OF TACOM...            5   
3301  BONNEVILLE POWER ADMINISTRATION||CITY OF RICHL...            5   
3314  BONNEVILLE POWER ADMINISTRATION||CITY OF TACOM...            5   
3359  PUGET SOUND ENERGY INC||PUD NO 1 OF WHATCOM CO...            5   
3362  BONNEVILLE POWER ADMINISTRATION||ORCAS POWER &...            4   

      Is Luxury Brand  
9                   0  
12                  0  
46                  1  
60                  0  
91                  0  
...               ...  
3286                0  
3301                0  
3314                0  
3359                0  
3362                0  

[306 rows x 13 columns]
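The column and threshold used for the Z-score test are not shown. As a sketch of the technique, assuming a single numeric column and an illustrative threshold: a value is flagged when its distance from the mean exceeds the threshold measured in standard deviations.

```python
import pandas as pd

# Illustrative prices; the last value is deliberately extreme.
prices = pd.Series(
    [34995, 36800, 36900, 44100, 69900, 69900, 69900, 81100, 98950, 845000],
    name="Base MSRP",
)

# Z-score: (value - mean) / standard deviation (population form, ddof=0).
z_scores = (prices - prices.mean()) / prices.std(ddof=0)

# Assumed threshold. With only ten points no |z| can exceed 3,
# so this small demo uses 2; 3 is a common choice on larger samples.
threshold = 2.0
outliers = prices[z_scores.abs() > threshold]
print(outliers)
```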
In [39]:
Potential Outliers based on IQR:
        County Postal Code  Model Year     Make       Model  \
9         King       98011        2019   SUBARU   CROSSTREK   
12       Clark       98606        2018  PORSCHE    PANAMERA   
60      Kitsap       98110        2018     MINI  COUNTRYMAN   
91        King       98177        2018     MINI  COUNTRYMAN   
125       King       98102        2019   SUBARU   CROSSTREK   
...        ...         ...         ...      ...         ...   
3286    Pierce       98405        2019     MINI  COUNTRYMAN   
3301    Benton       99352        2019     MINI  COUNTRYMAN   
3314    Pierce       98332        2019   SUBARU   CROSSTREK   
3359   Whatcom       98225        2019   SUBARU   CROSSTREK   
3362  San Juan       98245        2020  PORSCHE     CAYENNE   

                       Electric Vehicle Type  \
9     Plug-in Hybrid Electric Vehicle (PHEV)   
12    Plug-in Hybrid Electric Vehicle (PHEV)   
60    Plug-in Hybrid Electric Vehicle (PHEV)   
91    Plug-in Hybrid Electric Vehicle (PHEV)   
125   Plug-in Hybrid Electric Vehicle (PHEV)   
...                                      ...   
3286  Plug-in Hybrid Electric Vehicle (PHEV)   
3301  Plug-in Hybrid Electric Vehicle (PHEV)   
3314  Plug-in Hybrid Electric Vehicle (PHEV)   
3359  Plug-in Hybrid Electric Vehicle (PHEV)   
3362  Plug-in Hybrid Electric Vehicle (PHEV)   

      Clean Alternative Fuel Vehicle (CAFV) Eligibility  Electric Range  \
9                                                     1              17   
12                                                    1              14   
60                                                    1              12   
91                                                    1              12   
125                                                   1              17   
...                                                 ...             ...   
3286                                                  1              12   
3301                                                  1              12   
3314                                                  1              17   
3359                                                  1              17   
3362                                                  1              14   

      Base MSRP  Legislative District  \
9         34995                   1.0   
12       184400                  18.0   
60        36800                  23.0   
91        36800                  32.0   
125       34995                  43.0   
...         ...                   ...   
3286      36900                  27.0   
3301      36900                   8.0   
3314      34995                  26.0   
3359      34995                  42.0   
3362      81100                  40.0   

                                       Electric Utility  Vehicle Age  \
9         PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA)            5   
12    BONNEVILLE POWER ADMINISTRATION||PUD NO 1 OF C...            6   
60                               PUGET SOUND ENERGY INC            6   
91         CITY OF SEATTLE - (WA)|CITY OF TACOMA - (WA)            6   
125        CITY OF SEATTLE - (WA)|CITY OF TACOMA - (WA)            5   
...                                                 ...          ...   
3286  BONNEVILLE POWER ADMINISTRATION||CITY OF TACOM...            5   
3301  BONNEVILLE POWER ADMINISTRATION||CITY OF RICHL...            5   
3314  BONNEVILLE POWER ADMINISTRATION||CITY OF TACOM...            5   
3359  PUGET SOUND ENERGY INC||PUD NO 1 OF WHATCOM CO...            5   
3362  BONNEVILLE POWER ADMINISTRATION||ORCAS POWER &...            4   

      Is Luxury Brand  
9                   0  
12                  0  
60                  0  
91                  0  
125                 0  
...               ...  
3286                0  
3301                0  
3314                0  
3359                0  
3362                0  

[286 rows x 13 columns]
In [40]:
Lower bound for outliers: -94577.5
Upper bound for outliers: 204472.5
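Bounds of this kind usually come from the interquartile-range rule: lower = Q1 - 1.5 * IQR and upper = Q3 + 1.5 * IQR. The column and multiplier used in the notebook are not shown, so this is a sketch on made-up prices with the conventional factor of 1.5:

```python
import pandas as pd

# Illustrative prices; the last value is deliberately extreme.
prices = pd.Series(
    [34995, 36800, 36900, 44100, 69900, 69900, 69900, 81100, 98950, 845000],
    name="Base MSRP",
)

q1 = prices.quantile(0.25)
q3 = prices.quantile(0.75)
iqr = q3 - q1

# Conventional 1.5 * IQR fences (the multiplier is an assumption here).
lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr

outliers = prices[(prices < lower_bound) | (prices > upper_bound)]
print("Lower bound for outliers:", lower_bound)
print("Upper bound for outliers:", upper_bound)
print(outliers)
```

A negative lower bound, as in the output above, simply means that no value of a non-negative quantity can be flagged on the low side.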
In [41]:
Merged Outliers:
        County Postal Code  Model Year     Make       Model  \
0         King       98011        2019   SUBARU   CROSSTREK   
1        Clark       98606        2018  PORSCHE    PANAMERA   
2         King       98033        2008    TESLA    ROADSTER   
3       Kitsap       98110        2018     MINI  COUNTRYMAN   
4         King       98177        2018     MINI  COUNTRYMAN   
..         ...         ...         ...      ...         ...   
262  Klickitat       98620        2018     MINI  COUNTRYMAN   
263     Benton       99352        2019     MINI  COUNTRYMAN   
264     Pierce       98332        2019   SUBARU   CROSSTREK   
265    Whatcom       98225        2019   SUBARU   CROSSTREK   
266   San Juan       98245        2020  PORSCHE     CAYENNE   

                      Electric Vehicle Type  \
0    Plug-in Hybrid Electric Vehicle (PHEV)   
1    Plug-in Hybrid Electric Vehicle (PHEV)   
2            Battery Electric Vehicle (BEV)   
3    Plug-in Hybrid Electric Vehicle (PHEV)   
4    Plug-in Hybrid Electric Vehicle (PHEV)   
..                                      ...   
262  Plug-in Hybrid Electric Vehicle (PHEV)   
263  Plug-in Hybrid Electric Vehicle (PHEV)   
264  Plug-in Hybrid Electric Vehicle (PHEV)   
265  Plug-in Hybrid Electric Vehicle (PHEV)   
266  Plug-in Hybrid Electric Vehicle (PHEV)   

     Clean Alternative Fuel Vehicle (CAFV) Eligibility  Electric Range  \
0                                                    1              17   
1                                                    1              14   
2                                                    0             220   
3                                                    1              12   
4                                                    1              12   
..                                                 ...             ...   
262                                                  1              12   
263                                                  1              12   
264                                                  1              17   
265                                                  1              17   
266                                                  1              14   

     Base MSRP  Legislative District  \
0        34995                   1.0   
1       184400                  18.0   
2        98950                  45.0   
3        36800                  23.0   
4        36800                  32.0   
..         ...                   ...   
262      36800                  14.0   
263      36900                   8.0   
264      34995                  26.0   
265      34995                  42.0   
266      81100                  40.0   

                                      Electric Utility  Vehicle Age  \
0        PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA)            5   
1    BONNEVILLE POWER ADMINISTRATION||PUD NO 1 OF C...            6   
2        PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA)           16   
3                               PUGET SOUND ENERGY INC            6   
4         CITY OF SEATTLE - (WA)|CITY OF TACOMA - (WA)            6   
..                                                 ...          ...   
262  BONNEVILLE POWER ADMINISTRATION||PUD NO 1 OF K...            6   
263  BONNEVILLE POWER ADMINISTRATION||CITY OF RICHL...            5   
264  BONNEVILLE POWER ADMINISTRATION||CITY OF TACOM...            5   
265  PUGET SOUND ENERGY INC||PUD NO 1 OF WHATCOM CO...            5   
266  BONNEVILLE POWER ADMINISTRATION||ORCAS POWER &...            4   

     Is Luxury Brand Outlier Detection Method  
0                  0                  Z-score  
1                  0                  Z-score  
2                  1                  Z-score  
3                  0                  Z-score  
4                  0                  Z-score  
..               ...                      ...  
262                0                  Z-score  
263                0                  Z-score  
264                0                  Z-score  
265                0                  Z-score  
266                0                  Z-score  

[267 rows x 14 columns]
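How the two outlier sets were combined is not shown. One way to get a table with a method column and a fresh 0-based index is to tag each set, concatenate them, and drop rows whose data repeats. This is a sketch of that approach on made-up rows, not necessarily what the notebook did:

```python
import pandas as pd

# Illustrative outlier sets; values are made up.
z_outliers = pd.DataFrame(
    {"Make": ["SUBARU", "TESLA"], "Base MSRP": [34995, 98950]}, index=[9, 46]
)
iqr_outliers = pd.DataFrame(
    {"Make": ["SUBARU", "MINI"], "Base MSRP": [34995, 36800]}, index=[9, 60]
)
data_cols = list(z_outliers.columns)

# Tag each set with the method that produced it.
z_tagged = z_outliers.assign(**{"Outlier Detection Method": "Z-score"})
iqr_tagged = iqr_outliers.assign(**{"Outlier Detection Method": "IQR"})

# Stack both sets, then keep the first occurrence of each distinct data row.
merged = (
    pd.concat([z_tagged, iqr_tagged], ignore_index=True)
    .drop_duplicates(subset=data_cols, keep="first")
    .reset_index(drop=True)
)
print(merged)
```

With `keep="first"`, rows found by both methods keep the Z-score label. Deduplicating on row content also collapses distinct vehicles that share identical attributes, so deduplicating on the original index is an alternative worth considering.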
In [42]:
In [43]:
In [44]:
Out[44]:
array([2.22509516e+00, 3.66247109e+01, 1.46271650e+00, ...,
       1.17057172e+00, 2.20681253e+00, 7.41553461e+03])
In [45]:
In [46]:
In [47]:
In [48]:
Out[48]:
                    Electric Vehicle Type  Number Of Vehicles
0          Battery Electric Vehicle (BEV)                2154
1  Plug-in Hybrid Electric Vehicle (PHEV)                1209
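A two-column count table of this shape can be built by grouping on the vehicle type and counting rows. A minimal sketch on made-up rows, assuming the column names shown in the output:

```python
import pandas as pd

# Illustrative frame; values are made up.
df = pd.DataFrame({"Electric Vehicle Type": [
    "Battery Electric Vehicle (BEV)",
    "Plug-in Hybrid Electric Vehicle (PHEV)",
    "Battery Electric Vehicle (BEV)",
]})

# Count rows per type and name the count column.
counts = (
    df.groupby("Electric Vehicle Type")
    .size()
    .reset_index(name="Number Of Vehicles")
)
print(counts)
```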
In [49]:
In [50]:
In [51]:
C:\Users\sharm\AppData\Local\Temp\ipykernel_10336\2297688952.py:58: FutureWarning: The default value of numeric_only in DataFrame.corr is deprecated. In a future version, it will default to False. Select only valid columns or specify the value of numeric_only to silence this warning.
  sns.heatmap(data=df2.corr(), annot=True, cmap='coolwarm')
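This warning appears because `df2.corr()` is called on a frame that still contains non-numeric columns. As the message suggests, either select the numeric columns first or pass `numeric_only=True`. A sketch of the first option on a made-up frame (the heatmap call itself is left out so the example needs only pandas):

```python
import pandas as pd

# Illustrative frame with one text column and two numeric columns.
df2 = pd.DataFrame({
    "Make": ["TESLA", "BMW", "SUBARU", "TESLA"],
    "Electric Range": [208, 14, 17, 220],
    "Base MSRP": [69900, 44100, 34995, 98950],
})

# Restrict to numeric columns before computing the correlation matrix.
corr = df2.select_dtypes(include="number").corr()
print(corr)

# The result can then be passed to the plotting call from the warning, e.g.
# sns.heatmap(data=corr, annot=True, cmap='coolwarm')
```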
In [52]:
In [53]:
In [54]:
In [55]:
In [56]:
       PCA1      PCA2      PCA3      PCA4
0  1.556093 -0.408626  1.161561  0.093105
1 -1.794899  0.282176 -1.206265  0.644373
2  1.583671  0.093212  0.105939  0.040738
3  1.996461  0.477277 -0.529938 -0.068637
4  1.601515  0.417931 -0.577110  0.006853
In [57]:
Out[57]:
PCA1 PCA2 PCA3 PCA4
0 1.556093 -0.408626 1.161561 0.093105
1 -1.794899 0.282176 -1.206265 0.644373
2 1.583671 0.093212 0.105939 0.040738
3 1.996461 0.477277 -0.529938 -0.068637
4 1.601515 0.417931 -0.577110 0.006853
... ... ... ... ...
3358 1.955905 -0.260721 1.022446 0.008374
3359 -3.493953 1.686698 -0.074403 -2.116392
3360 1.614493 0.654090 -1.073873 -0.017790
3361 1.951039 -0.349280 1.208732 0.017615
3362 -3.414608 2.905715 0.591203 -0.962471

3363 rows × 4 columns

In [58]:
In [59]:
In [60]:
In [61]:
Correlation Matrix between Principal Components:
Out[61]:
0 1 2 3 4 5 6
0 1.000000e+00 4.731081e-17 -1.849297e-17 7.371007e-17 2.889920e-17 -3.326304e-17 1.321415e-16
1 4.731081e-17 1.000000e+00 -5.506418e-16 6.162041e-16 3.412085e-16 1.174665e-16 -5.474931e-17
2 -1.849297e-17 -5.506418e-16 1.000000e+00 6.907374e-17 3.881163e-16 2.448669e-16 1.388946e-16
3 7.371007e-17 6.162041e-16 6.907374e-17 1.000000e+00 2.507575e-16 -2.736567e-16 6.580812e-17
4 2.889920e-17 3.412085e-16 3.881163e-16 2.507575e-16 1.000000e+00 2.916532e-16 -6.528461e-17
5 -3.326304e-17 1.174665e-16 2.448669e-16 -2.736567e-16 2.916532e-16 1.000000e+00 1.431397e-17
6 1.321415e-16 -5.474931e-17 1.388946e-16 6.580812e-17 -6.528461e-17 1.431397e-17 1.000000e+00
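The off-diagonal entries above are on the order of 1e-16, which is the level of double-precision rounding error, so the matrix is the identity for practical purposes: the principal components are mutually uncorrelated, as PCA guarantees by construction. A small sketch that makes this explicit on the top-left corner of the printed matrix; the helper `snap_to_zero` and the tolerance `1e-12` are illustrative choices, not part of the notebook.

```python
# Top-left 3x3 corner of the correlation matrix printed above.
corr = [
    [1.000000e+00, 4.731081e-17, -1.849297e-17],
    [4.731081e-17, 1.000000e+00, -5.506418e-16],
    [-1.849297e-17, -5.506418e-16, 1.000000e+00],
]


def snap_to_zero(matrix, tol=1e-12):
    """Replace entries smaller than tol in magnitude with exact zeros."""
    return [[0.0 if abs(v) < tol else v for v in row] for row in matrix]


cleaned = snap_to_zero(corr)
for row in cleaned:
    print(row)
```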
In [62]:
In [63]:
In [64]:
In [65]:
In [66]:
Accuracy with selected features: 1.0
Selected Features: Index(['Model Year', 'Make', 'Electric Range'], dtype='object')
In [67]:
C:\Users\sharm\AppData\Local\Temp\ipykernel_10336\1837273610.py:166: FutureWarning:

pivot_table dropped a column because it failed to aggregate. This behavior is deprecated and will raise in a future version of pandas. Select only the columns that can be aggregated.

In [68]:
                      Feature          Score
0                      County      31.541272
1                 Postal Code     185.209996
2                  Model Year       5.462177
3                       Model    1265.951520
4       Electric Vehicle Type    1780.989285
5              Electric Range  145995.344086
6                   Base MSRP  464565.444786
7        Legislative District      13.562450
8            Electric Utility      34.007947
9                 Vehicle Age    1309.507885
10            Is Luxury Brand      40.051115
11                   Make_BMW    1092.475610
12              Make_CADILLAC       6.497170
13              Make_CHRYSLER      51.513278
14                Make_FISKER       6.961254
15                   Make_KIA     275.665651
16                  Make_MINI     331.836773
17               Make_PORSCHE      71.107880
18                Make_SUBARU     144.370544
19                 Make_TESLA     723.970396
20                 Make_VOLVO     657.209193
21  Make_WHEEGO ELECTRIC CARS       1.392251
(3363, 22)
In [69]:
                      Feature  Correlation with Target  Absolute Correlation
4       Electric Vehicle Type                 0.909301              0.909301
5              Electric Range                -0.809810              0.809810
9                 Vehicle Age                -0.730987              0.730987
2                  Model Year                 0.730987              0.730987
19                 Make_TESLA                -0.633669              0.633669
11                   Make_BMW                 0.618482              0.618482
20                 Make_VOLVO                 0.463589              0.463589
3                       Model                -0.411136              0.411136
10            Is Luxury Brand                -0.374218              0.374218
16                  Make_MINI                 0.321571              0.321571
15                   Make_KIA                -0.315522              0.315522
18                Make_SUBARU                 0.209289              0.209289
17               Make_PORSCHE                 0.146129              0.146129
13              Make_CHRYSLER                -0.125859              0.125859
6                   Base MSRP                -0.123516              0.123516
0                      County                -0.050266              0.050266
8            Electric Utility                -0.049671              0.049671
14                Make_FISKER                -0.045599              0.045599
12              Make_CADILLAC                -0.044046              0.044046
1                 Postal Code                 0.028651              0.028651
7        Legislative District                -0.024082              0.024082
21  Make_WHEEGO ELECTRIC CARS                -0.020356              0.020356
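The table above ranks features by the absolute value of their Pearson correlation with the target. A minimal sketch of that ranking in plain Python, assuming Pearson's r is the measure used (the notebook cell is not shown); the helper names and the toy columns are invented for illustration. It also shows why `Model Year` and `Vehicle Age` carry the same absolute correlation with opposite signs: in the toy data, as presumably in the notebook, one is a constant minus the other.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient; undefined for constant columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rank_by_abs_correlation(features, target):
    """Return (name, r) pairs sorted by |r|, strongest first."""
    rows = [(name, pearson(col, target)) for name, col in features.items()]
    return sorted(rows, key=lambda row: abs(row[1]), reverse=True)


# Toy columns, invented for illustration only.
model_year = [2014, 2017, 2023, 2023, 2020]
features = {
    "Model Year": model_year,
    "Vehicle Age": [2024 - y for y in model_year],  # assumed definition
    "Postal Code": [98902, 98513, 98058, 98012, 98031],
}
target = [0, 0, 1, 2, 0]  # hypothetical encoded target

ranking = rank_by_abs_correlation(features, target)
for name, r in ranking:
    print(f"{name:12s} {r:+.6f} {abs(r):.6f}")
```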
In [70]:
In [71]:
Correlation table for Vehicle Age:
+----+---------------------------------------------------+--------------------------------+------------------------+
|    | Feature                                           |   Correlation with Vehicle Age |   Absolute Correlation |
+====+===================================================+================================+========================+
|  2 | Model Year                                        |                   -1           |            1           |
+----+---------------------------------------------------+--------------------------------+------------------------+
| 17 | Make_TESLA                                        |                    0.890469    |            0.890469    |
+----+---------------------------------------------------+--------------------------------+------------------------+
|  4 | Electric Vehicle Type                             |                   -0.793498    |            0.793498    |
+----+---------------------------------------------------+--------------------------------+------------------------+
|  5 | Clean Alternative Fuel Vehicle (CAFV) Eligibility |                   -0.730987    |            0.730987    |
+----+---------------------------------------------------+--------------------------------+------------------------+
|  6 | Base MSRP                                         |                    0.464787    |            0.464787    |
+----+---------------------------------------------------+--------------------------------+------------------------+
|  9 | Make_BMW                                          |                   -0.399042    |            0.399042    |
+----+---------------------------------------------------+--------------------------------+------------------------+
| 18 | Make_VOLVO                                        |                   -0.344391    |            0.344391    |
+----+---------------------------------------------------+--------------------------------+------------------------+
| 14 | Make_MINI                                         |                   -0.270533    |            0.270533    |
+----+---------------------------------------------------+--------------------------------+------------------------+
| 11 | Make_CHRYSLER                                     |                   -0.254328    |            0.254328    |
+----+---------------------------------------------------+--------------------------------+------------------------+
| 16 | Make_SUBARU                                       |                   -0.196269    |            0.196269    |
+----+---------------------------------------------------+--------------------------------+------------------------+
| 13 | Make_KIA                                          |                   -0.166008    |            0.166008    |
+----+---------------------------------------------------+--------------------------------+------------------------+
| 15 | Make_PORSCHE                                      |                   -0.146789    |            0.146789    |
+----+---------------------------------------------------+--------------------------------+------------------------+
|  3 | Model                                             |                    0.132397    |            0.132397    |
+----+---------------------------------------------------+--------------------------------+------------------------+
| 12 | Make_FISKER                                       |                    0.0971524   |            0.0971524   |
+----+---------------------------------------------------+--------------------------------+------------------------+
| 19 | Make_WHEEGO ELECTRIC CARS                         |                    0.0675143   |            0.0675143   |
+----+---------------------------------------------------+--------------------------------+------------------------+
| 10 | Make_CADILLAC                                     |                   -0.0572864   |            0.0572864   |
+----+---------------------------------------------------+--------------------------------+------------------------+
|  8 | Electric Utility                                  |                    0.0399863   |            0.0399863   |
+----+---------------------------------------------------+--------------------------------+------------------------+
|  7 | Legislative District                              |                    0.0293494   |            0.0293494   |
+----+---------------------------------------------------+--------------------------------+------------------------+
|  1 | Postal Code                                       |                   -0.0259876   |            0.0259876   |
+----+---------------------------------------------------+--------------------------------+------------------------+
|  0 | County                                            |                    0.000301451 |            0.000301451 |
+----+---------------------------------------------------+--------------------------------+------------------------+


Correlation table for Is Luxury Brand:
+----+---------------------------------------------------+------------------------------------+------------------------+
|    | Feature                                           |   Correlation with Is Luxury Brand |   Absolute Correlation |
+====+===================================================+====================================+========================+
| 14 | Make_MINI                                         |                        -0.718549   |             0.718549   |
+----+---------------------------------------------------+------------------------------------+------------------------+
| 16 | Make_SUBARU                                       |                        -0.467654   |             0.467654   |
+----+---------------------------------------------------+------------------------------------+------------------------+
|  4 | Electric Vehicle Type                             |                        -0.406939   |             0.406939   |
+----+---------------------------------------------------+------------------------------------+------------------------+
|  5 | Clean Alternative Fuel Vehicle (CAFV) Eligibility |                        -0.374218   |             0.374218   |
+----+---------------------------------------------------+------------------------------------+------------------------+
|  2 | Model Year                                        |                        -0.335709   |             0.335709   |
+----+---------------------------------------------------+------------------------------------+------------------------+
| 15 | Make_PORSCHE                                      |                        -0.326524   |             0.326524   |
+----+---------------------------------------------------+------------------------------------+------------------------+
| 17 | Make_TESLA                                        |                         0.283585   |             0.283585   |
+----+---------------------------------------------------+------------------------------------+------------------------+
|  3 | Model                                             |                         0.221819   |             0.221819   |
+----+---------------------------------------------------+------------------------------------+------------------------+
| 12 | Make_FISKER                                       |                        -0.21955    |             0.21955    |
+----+---------------------------------------------------+------------------------------------+------------------------+
| 10 | Make_CADILLAC                                     |                        -0.212074   |             0.212074   |
+----+---------------------------------------------------+------------------------------------+------------------------+
| 13 | Make_KIA                                          |                         0.141205   |             0.141205   |
+----+---------------------------------------------------+------------------------------------+------------------------+
|  9 | Make_BMW                                          |                         0.128453   |             0.128453   |
+----+---------------------------------------------------+------------------------------------+------------------------+
| 19 | Make_WHEEGO ELECTRIC CARS                         |                        -0.0980103  |             0.0980103  |
+----+---------------------------------------------------+------------------------------------+------------------------+
| 18 | Make_VOLVO                                        |                         0.0962832  |             0.0962832  |
+----+---------------------------------------------------+------------------------------------+------------------------+
| 11 | Make_CHRYSLER                                     |                         0.0563256  |             0.0563256  |
+----+---------------------------------------------------+------------------------------------+------------------------+
|  6 | Base MSRP                                         |                         0.0551072  |             0.0551072  |
+----+---------------------------------------------------+------------------------------------+------------------------+
|  8 | Electric Utility                                  |                         0.0416786  |             0.0416786  |
+----+---------------------------------------------------+------------------------------------+------------------------+
|  1 | Postal Code                                       |                        -0.0305481  |             0.0305481  |
+----+---------------------------------------------------+------------------------------------+------------------------+
|  7 | Legislative District                              |                         0.0173947  |             0.0173947  |
+----+---------------------------------------------------+------------------------------------+------------------------+
|  0 | County                                            |                        -0.00725148 |             0.00725148 |
+----+---------------------------------------------------+------------------------------------+------------------------+


Correlation table for Electric Range:
+----+---------------------------------------------------+-----------------------------------+------------------------+
|    | Feature                                           |   Correlation with Electric Range |   Absolute Correlation |
+====+===================================================+===================================+========================+
| 17 | Make_TESLA                                        |                        0.945243   |             0.945243   |
+----+---------------------------------------------------+-----------------------------------+------------------------+
|  2 | Model Year                                        |                       -0.931126   |             0.931126   |
+----+---------------------------------------------------+-----------------------------------+------------------------+
|  4 | Electric Vehicle Type                             |                       -0.872568   |             0.872568   |
+----+---------------------------------------------------+-----------------------------------+------------------------+
|  5 | Clean Alternative Fuel Vehicle (CAFV) Eligibility |                       -0.80981    |             0.80981    |
+----+---------------------------------------------------+-----------------------------------+------------------------+
|  9 | Make_BMW                                          |                       -0.50387    |             0.50387    |
+----+---------------------------------------------------+-----------------------------------+------------------------+
|  6 | Base MSRP                                         |                        0.405223   |             0.405223   |
+----+---------------------------------------------------+-----------------------------------+------------------------+
| 18 | Make_VOLVO                                        |                       -0.367002   |             0.367002   |
+----+---------------------------------------------------+-----------------------------------+------------------------+
| 14 | Make_MINI                                         |                       -0.268121   |             0.268121   |
+----+---------------------------------------------------+-----------------------------------+------------------------+
|  3 | Model                                             |                        0.192958   |             0.192958   |
+----+---------------------------------------------------+-----------------------------------+------------------------+
| 11 | Make_CHRYSLER                                     |                       -0.185002   |             0.185002   |
+----+---------------------------------------------------+-----------------------------------+------------------------+
| 16 | Make_SUBARU                                       |                       -0.166568   |             0.166568   |
+----+---------------------------------------------------+-----------------------------------+------------------------+
| 13 | Make_KIA                                          |                       -0.138142   |             0.138142   |
+----+---------------------------------------------------+-----------------------------------+------------------------+
| 15 | Make_PORSCHE                                      |                       -0.119692   |             0.119692   |
+----+---------------------------------------------------+-----------------------------------+------------------------+
| 12 | Make_FISKER                                       |                       -0.0662812  |             0.0662812  |
+----+---------------------------------------------------+-----------------------------------+------------------------+
| 10 | Make_CADILLAC                                     |                       -0.0654632  |             0.0654632  |
+----+---------------------------------------------------+-----------------------------------+------------------------+
|  8 | Electric Utility                                  |                        0.0519981  |             0.0519981  |
+----+---------------------------------------------------+-----------------------------------+------------------------+
|  7 | Legislative District                              |                        0.0372985  |             0.0372985  |
+----+---------------------------------------------------+-----------------------------------+------------------------+
|  1 | Postal Code                                       |                       -0.0319469  |             0.0319469  |
+----+---------------------------------------------------+-----------------------------------+------------------------+
| 19 | Make_WHEEGO ELECTRIC CARS                         |                       -0.00731009 |             0.00731009 |
+----+---------------------------------------------------+-----------------------------------+------------------------+
|  0 | County                                            |                        0.00169333 |             0.00169333 |
+----+---------------------------------------------------+-----------------------------------+------------------------+


In [72]:
Correlation Matrix:
C:\Users\sharm\AppData\Local\Temp\ipykernel_10336\988377122.py:6: FutureWarning:

this method is deprecated in favour of `Styler.format(precision=..)`

  County Postal Code Model Year Model Electric Vehicle Type Clean Alternative Fuel Vehicle (CAFV) Eligibility Electric Range Base MSRP Legislative District Electric Utility Vehicle Age Is Luxury Brand Make_BMW Make_CADILLAC Make_CHRYSLER Make_FISKER Make_KIA Make_MINI Make_PORSCHE Make_SUBARU Make_TESLA Make_VOLVO Make_WHEEGO ELECTRIC CARS
County 1.00 0.08 -0.00 0.02 -0.02 -0.05 0.00 -0.04 -0.09 0.06 0.00 -0.01 -0.02 0.03 0.05 0.00 0.06 0.01 -0.03 -0.00 -0.02 -0.05 0.04
Postal Code 0.08 1.00 0.03 -0.04 0.03 0.03 -0.03 -0.02 -0.42 -0.62 -0.03 -0.03 0.03 0.01 0.01 -0.01 -0.01 0.01 0.00 0.04 -0.03 -0.02 0.05
Model Year -0.00 0.03 1.00 -0.13 0.79 0.73 -0.93 -0.46 -0.03 -0.04 -1.00 -0.34 0.40 0.06 0.25 -0.10 0.17 0.27 0.15 0.20 -0.89 0.34 -0.07
Model 0.02 -0.04 -0.13 1.00 -0.39 -0.41 0.19 -0.09 0.01 0.03 0.13 0.22 -0.78 -0.03 0.04 -0.01 0.45 -0.19 -0.06 -0.09 0.02 0.52 0.04
Electric Vehicle Type -0.02 0.03 0.79 -0.39 1.00 0.91 -0.87 -0.15 -0.03 -0.06 -0.79 -0.41 0.56 0.09 0.25 0.09 -0.35 0.29 0.13 0.19 -0.70 0.42 0.04
Clean Alternative Fuel Vehicle (CAFV) Eligibility -0.05 0.03 0.73 -0.41 0.91 1.00 -0.81 -0.12 -0.02 -0.05 -0.73 -0.37 0.62 -0.04 -0.13 -0.05 -0.32 0.32 0.15 0.21 -0.63 0.46 -0.02
Electric Range 0.00 -0.03 -0.93 0.19 -0.87 -0.81 1.00 0.41 0.04 0.05 0.93 0.36 -0.50 -0.07 -0.19 -0.07 -0.14 -0.27 -0.12 -0.17 0.95 -0.37 -0.01
Base MSRP -0.04 -0.02 -0.46 -0.09 -0.15 -0.12 0.41 1.00 0.03 0.03 0.46 0.06 -0.08 0.05 -0.14 0.13 -0.51 -0.20 0.34 -0.14 0.53 -0.01 -0.03
Legislative District -0.09 -0.42 -0.03 0.01 -0.03 -0.02 0.04 0.03 1.00 0.24 0.03 0.02 -0.02 -0.01 -0.01 0.00 -0.01 -0.00 0.01 -0.03 0.04 0.00 -0.02
Electric Utility 0.06 -0.62 -0.04 0.03 -0.06 -0.05 0.05 0.03 0.24 1.00 0.04 0.04 -0.04 -0.02 -0.01 -0.01 0.01 -0.02 -0.00 -0.03 0.05 -0.01 -0.05
Vehicle Age 0.00 -0.03 -1.00 0.13 -0.79 -0.73 0.93 0.46 0.03 0.04 1.00 0.34 -0.40 -0.06 -0.25 0.10 -0.17 -0.27 -0.15 -0.20 0.89 -0.34 0.07
Is Luxury Brand -0.01 -0.03 -0.34 0.22 -0.41 -0.37 0.36 0.06 0.02 0.04 0.34 1.00 0.13 -0.21 0.06 -0.22 0.14 -0.72 -0.33 -0.47 0.28 0.10 -0.10
Make_BMW -0.02 0.03 0.40 -0.78 0.56 0.62 -0.50 -0.08 -0.02 -0.04 -0.40 0.13 1.00 -0.03 -0.08 -0.03 -0.20 -0.09 -0.04 -0.06 -0.39 -0.13 -0.01
Make_CADILLAC 0.03 0.01 0.06 -0.03 0.09 -0.04 -0.07 0.05 -0.01 -0.02 -0.06 -0.21 -0.03 1.00 -0.01 -0.00 -0.03 -0.01 -0.01 -0.01 -0.06 -0.02 -0.00
Make_CHRYSLER 0.05 0.01 0.25 0.04 0.25 -0.13 -0.19 -0.14 -0.01 -0.01 -0.25 0.06 -0.08 -0.01 1.00 -0.01 -0.09 -0.04 -0.02 -0.03 -0.17 -0.06 -0.01
Make_FISKER 0.00 -0.01 -0.10 -0.01 0.09 -0.05 -0.07 0.13 0.00 -0.01 0.10 -0.22 -0.03 -0.00 -0.01 1.00 -0.03 -0.01 -0.01 -0.01 -0.06 -0.02 -0.00
Make_KIA 0.06 -0.01 0.17 0.45 -0.35 -0.32 -0.14 -0.51 -0.01 0.01 -0.17 0.14 -0.20 -0.03 -0.09 -0.03 1.00 -0.10 -0.05 -0.07 -0.43 -0.15 -0.01
Make_MINI 0.01 0.01 0.27 -0.19 0.29 0.32 -0.27 -0.20 -0.00 -0.02 -0.27 -0.72 -0.09 -0.01 -0.04 -0.01 -0.10 1.00 -0.02 -0.03 -0.20 -0.07 -0.01
Make_PORSCHE -0.03 0.00 0.15 -0.06 0.13 0.15 -0.12 0.34 0.01 -0.00 -0.15 -0.33 -0.04 -0.01 -0.02 -0.01 -0.05 -0.02 1.00 -0.01 -0.09 -0.03 -0.00
Make_SUBARU -0.00 0.04 0.20 -0.09 0.19 0.21 -0.17 -0.14 -0.03 -0.03 -0.20 -0.47 -0.06 -0.01 -0.03 -0.01 -0.07 -0.03 -0.01 1.00 -0.13 -0.05 -0.00
Make_TESLA -0.02 -0.03 -0.89 0.02 -0.70 -0.63 0.95 0.53 0.04 0.05 0.89 0.28 -0.39 -0.06 -0.17 -0.06 -0.43 -0.20 -0.09 -0.13 1.00 -0.29 -0.03
Make_VOLVO -0.05 -0.02 0.34 0.52 0.42 0.46 -0.37 -0.01 0.00 -0.01 -0.34 0.10 -0.13 -0.02 -0.06 -0.02 -0.15 -0.07 -0.03 -0.05 -0.29 1.00 -0.01
Make_WHEEGO ELECTRIC CARS 0.04 0.05 -0.07 0.04 0.04 -0.02 -0.01 -0.03 -0.02 -0.05 0.07 -0.10 -0.01 -0.00 -0.01 -0.00 -0.01 -0.01 -0.00 -0.00 -0.03 -0.01 1.00
In [73]:
Selected features (Forward Selection with Simple Linear Regression): ['Model Year', 'Electric Vehicle Type', 'Clean Alternative Fuel Vehicle (CAFV) Eligibility', 'County', 'Legislative District', 'Postal Code', 'Base MSRP', 'Electric Utility']
In [74]:
Feature: Base MSRP
MSE: 4.61065838560137

Feature: Electric Range
MSE: 0.7799585910598922

Feature: Electric Utility
MSE: 6.345926593040526

Best Feature: Electric Range
MSE: 0.7799585910598922
In [75]:
Features: ('Base MSRP', 'Electric Range')
MSE: 0.7331659090181231

Features: ('Base MSRP', 'Electric Utility')
MSE: 4.600467700113733

Features: ('Electric Range', 'Electric Utility')
MSE: 0.7798374721503019

Best Features: ('Base MSRP', 'Electric Range')
MSE: 0.7331659090181231
In [76]:
Features: ('Base MSRP', 'Electric Range', 'Electric Utility')
MSE: 0.732858221045177

Best Features: ('Base MSRP', 'Electric Range', 'Electric Utility')
MSE: 0.732858221045177
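The three outputs above follow the pattern of an exhaustive (best-subset) search: fit a linear regression on every subset of size 1, 2 and 3 and keep the subset with the lowest MSE. The sketch below reproduces that pattern in plain Python under stated assumptions: the helpers `solve`, `fit_mse` and `best_subset` are hypothetical, the data are invented, and the MSE is computed on the same rows the model is fit on, which may differ from what the notebook does.

```python
from itertools import combinations


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_mse(rows, y):
    """Ordinary least squares with an intercept; return the training MSE."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    preds = [sum(b * v for b, v in zip(beta, r)) for r in X]
    return sum((pred - t) ** 2 for pred, t in zip(preds, y)) / len(y)


def best_subset(data, y, k):
    """Score every k-feature subset and return (best subset, all scores)."""
    results = {}
    for subset in combinations(sorted(data), k):
        rows = list(zip(*(data[name] for name in subset)))
        results[subset] = fit_mse(rows, y)
    return min(results, key=results.get), results


# Invented toy data; Base MSRP is in thousands of dollars here to keep
# the normal equations well conditioned.
data = {
    "Base MSRP": [0.0, 69.9, 0.0, 31.95, 0.0, 0.0, 59.9, 0.0],
    "Electric Range": [87, 200, 20, 0, 322, 19, 150, 240],
    "Electric Utility": [3, 1, 2, 1, 2, 4, 1, 3],  # hypothetical codes
}
vehicle_age = [10, 7, 1, 1, 4, 11, 6, 3]

all_results, best_mse = {}, {}
for k in (1, 2, 3):
    best, all_results[k] = best_subset(data, vehicle_age, k)
    best_mse[k] = all_results[k][best]
    print(f"Best Features: {best}\nMSE: {best_mse[k]}\n")
```

When the MSE is measured on the fitting data, the best score can only stay equal or improve as features are added, so the raw MSE alone cannot decide how many features to keep; a held-out split or an adjusted criterion is needed for that.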
In [77]:
Selected Features: ['Base MSRP', 'Electric Range']
                            OLS Regression Results                            
==============================================================================
Dep. Variable:            Vehicle Age   R-squared:                       0.874
Model:                            OLS   Adj. R-squared:                  0.874
Method:                 Least Squares   F-statistic:                     9306.
Date:                Fri, 26 Apr 2024   Prob (F-statistic):               0.00
Time:                        23:50:37   Log-Likelihood:                -3457.5
No. Observations:                2690   AIC:                             6921.
Df Residuals:                    2687   BIC:                             6939.
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
==================================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
----------------------------------------------------------------------------------
const              4.8077      0.045    107.081      0.000       4.720       4.896
Base MSRP       1.101e-05   7.69e-07     14.316      0.000    9.51e-06    1.25e-05
Electric Range     0.0244      0.000    119.955      0.000       0.024       0.025
==============================================================================
Omnibus:                     1370.130   Durbin-Watson:                   2.070
Prob(Omnibus):                  0.000   Jarque-Bera (JB):            20370.134
Skew:                           2.049   Prob(JB):                         0.00
Kurtosis:                      15.843   Cond. No.                     1.65e+05
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 1.65e+05. This might indicate that there are
strong multicollinearity or other numerical problems.
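Read as an equation, the fitted model above is Vehicle Age ≈ 4.8077 + 1.101e-05 × Base MSRP + 0.0244 × Electric Range. The snippet below simply evaluates that equation with the printed coefficients. The function name `predict_vehicle_age` is hypothetical, and the inputs (Base MSRP 0, Electric Range 87) are chosen purely as illustrative values.

```python
# Coefficients copied from the OLS summary above.
CONST = 4.8077
COEF_BASE_MSRP = 1.101e-05
COEF_ELECTRIC_RANGE = 0.0244


def predict_vehicle_age(base_msrp, electric_range):
    """Evaluate the fitted linear equation for one vehicle."""
    return (CONST
            + COEF_BASE_MSRP * base_msrp
            + COEF_ELECTRIC_RANGE * electric_range)


# Illustrative inputs: Base MSRP 0, Electric Range 87.
age = predict_vehicle_age(0, 87)
print(round(age, 4))

# Effect of a 10,000 increase in Base MSRP, holding range fixed.
msrp_effect = predict_vehicle_age(10_000, 87) - age
print(round(msrp_effect, 4))
```

The tiny `Base MSRP` coefficient reflects the units (dollars against miles of range) rather than a negligible effect. The same difference in scale contributes to the large condition number reported in note [2].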
In [78]:
In [79]:
WARNING:pgmpy:BayesianModel has been renamed to BayesianNetwork. Please use BayesianNetwork class, BayesianModel will be removed in future.
Bayesian Network Structure:
[('County', 'Make'), ('County', 'Model'), ('County', 'Electric Vehicle Type'), ('County', 'Clean Alternative Fuel Vehicle (CAFV) Eligibility'), ('County', 'Electric Utility')]
In [80]:
In [81]:
In [82]:
Model Year
1997        1
1998        1
1999        3
2000        7
2002        2
2003        1
2008       20
2010       23
2011      782
2012     1630
2013     4455
2014     3539
2015     4833
2016     5518
2017     8523
2018    14151
2019    10860
2020    11425
2021    18774
2022    27592
2023    51351
2024     3309
dtype: int64
In [83]:
In [84]:
Out[84]:
year total_cars
0 1997 1
1 1998 1
2 1999 3
3 2000 7
4 2002 2
5 2003 1
6 2008 20
7 2010 23
8 2011 782
9 2012 1630
10 2013 4455
11 2014 3539
12 2015 4833
13 2016 5518
14 2017 8523
15 2018 14151
16 2019 10860
17 2020 11425
18 2021 18774
19 2022 27592
20 2023 51351
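The `year` / `total_cars` table above has the same counts as the earlier `Model Year` series, minus the 2024 row, presumably because that year was still incomplete when the data were pulled. A minimal sketch of that aggregation, assuming a plain count per model year followed by a cutoff; the function name `cars_per_year`, the `last_complete_year` parameter, and the sample list are hypothetical.

```python
from collections import Counter


def cars_per_year(model_years, last_complete_year):
    """Count vehicles per model year, dropping years after the cutoff."""
    counts = Counter(model_years)
    return [(year, counts[year])
            for year in sorted(counts)
            if year <= last_complete_year]


# Hypothetical mini-sample of model years.
model_years = [2014, 2017, 2023, 2023, 2020, 2013, 2021, 2024, 2023]
table = cars_per_year(model_years, last_complete_year=2023)

print("year total_cars")
for year, total in table:
    print(year, total)
```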
In [85]:
In [86]:
In [87]:
In [88]:
In [89]:
Out[89]:
<seaborn.axisgrid.FacetGrid at 0x2bd885e4f10>
In [90]:
In [91]:
In [92]:
In [93]:
In [94]:
x [[1997], [1998], [1999], [2000], [2002], [2003...
y [[1], [1], [3], [7], [2], [1], [20], [23], [78...
In [95]:
x [[1997], [1998], [1999], [2000], [2002], [2003...
y [[1], [1], [3], [7], [2], [1], [20], [23], [78...
In [96]:
x [[1997], [1998], [1999], [2000], [2002], [2003...
y [[1], [1], [3], [7], [2], [1], [20], [23], [78...
In [97]:
The best degree is 10 with MSE: 35570899.08218243
In [98]:
The predicted number of cars in the year [2024] is [40584.77018535]
The predicted number of cars in the year [2025] is [45905.4836117]
The predicted number of cars in the year [2026] is [51522.51921761]
In [99]:
Enter the year you want to predict for: 2024
The predicted number of cars in the year 2024 is [40584.77018535]
In [100]:
C:\Users\sharm\anaconda3\Lib\site-packages\statsmodels\tsa\statespace\sarimax.py:966: UserWarning:

Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.

                               SARIMAX Results                                
==============================================================================
Dep. Variable:             total_cars   No. Observations:                   21
Model:                 ARIMA(5, 1, 0)   Log Likelihood                -193.988
Date:                Fri, 26 Apr 2024   AIC                            399.976
Time:                        23:53:18   BIC                            405.950
Sample:                             0   HQIC                           401.142
                                 - 21                                         
Covariance Type:                  opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
ar.L1          1.0684      0.216      4.957      0.000       0.646       1.491
ar.L2          0.0062      0.265      0.023      0.981      -0.513       0.525
ar.L3          0.0774      0.017      4.462      0.000       0.043       0.111
ar.L4         -0.8118      0.300     -2.707      0.007      -1.399      -0.224
ar.L5          0.6599      0.297      2.219      0.027       0.077       1.243
sigma2      9.885e+06   3.74e-08   2.65e+14      0.000    9.89e+06    9.89e+06
===================================================================================
Ljung-Box (L1) (Q):                   0.29   Jarque-Bera (JB):                 0.83
Prob(Q):                              0.59   Prob(JB):                         0.66
Heteroskedasticity (H):          315010.99   Skew:                            -0.03
Prob(H) (two-sided):                  0.00   Kurtosis:                         4.00
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
[2] Covariance matrix is singular or near-singular, with condition number 4.16e+29. Standard errors may be unstable.
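To make the coefficient table easier to read, the sketch below works through roughly how an ARIMA(5, 1, 0) model turns the `ar.L1` to `ar.L5` rows into a one-step-ahead forecast: the series is differenced once, the five most recent differences are weighted by the AR coefficients, and the result is added back to the last observed level. It assumes no constant term and uses the rounded coefficients printed above, so it is only an approximation of what the fitted model's own forecast method would return.

```python
import numpy as np

# Last six yearly totals (2018-2023) copied from the Out[84] table.
totals = np.array([14151, 10860, 11425, 18774, 27592, 51351], dtype=float)

# Rounded AR coefficients from the summary above (ar.L1 ... ar.L5).
ar_coefs = np.array([1.0684, 0.0062, 0.0774, -0.8118, 0.6599])

# d = 1: the model works on first differences of the series.
diffs = np.diff(totals)

# Most recent difference first, so it lines up with ar.L1.
recent = diffs[::-1][:5]

# Predicted next difference, then undo the differencing.
forecast_diff = float(ar_coefs @ recent)
forecast = float(totals[-1] + forecast_diff)
print(round(forecast_diff, 1), round(forecast, 1))
```

Because the model has only 21 observations and the summary warns that the covariance matrix is near-singular, the individual coefficients (and any forecast built from them) should be treated with caution.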
                 0
count    21.000000
mean    969.084405
std    3486.557462
min   -7720.478264
25%      -8.797337
50%       3.476790
75%    2803.799189
max    8317.490712
C:\Users\sharm\AppData\Local\Temp\ipykernel_10336\1837273610.py:166: FutureWarning:

pivot_table dropped a column because it failed to aggregate. This behavior is deprecated and will raise in a future version of pandas. Select only the columns that can be aggregated.
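The warning asks for only aggregatable columns to be passed to `pivot_table`. The call at line 166 of the cell is not shown, so the following is a sketch under assumptions: the `df` frame is a hypothetical four-row sample using column names and values from the dataset shown earlier, and `Make` as the index and `"mean"` as the aggregation are illustrative choices.

```python
import pandas as pd

# Hypothetical sample; column names and values follow the dataset shown earlier.
df = pd.DataFrame({
    "Make": ["TESLA", "TESLA", "BMW", "FIAT"],
    "Model Year": [2020, 2021, 2023, 2014],
    "Electric Range": [322, 0, 20, 87],
    "Vehicle Location": [
        "POINT (-122.2012521 47.3931814)",
        "POINT (-122.0313266 47.6285782)",
        "POINT (-122.1298876 47.4451257)",
        "POINT (-120.524012 46.5973939)",
    ],
})

# Pick the numeric columns explicitly so no text column reaches the aggregation.
numeric_cols = df.select_dtypes(include="number").columns.tolist()

table = df.pivot_table(index="Make", values=numeric_cols, aggfunc="mean")
print(table)
```

Passing `values` explicitly keeps the result stable across pandas versions, since newer releases raise an error instead of silently dropping columns that cannot be aggregated.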
